AMENDMENT

A listing of the claims presented in this patent application appears below. This listing 
replaces all prior versions and listing of claims in this patent application. 

Claim 1 (currently amended): A method of identifying a compound which modulates 
binding of a natural ligand to the EGF receptor, ErbB3 or ErbB4, or which modulates signal 
transduction by binding to the EGF receptor, ErbB2, ErbB3 or ErbB4, which method 
comprises the steps of: 

(A) assessing the stereochemical complementarity between the compound and a 
molecule, wherein the molecule comprises: 

(i) amino acids 1-621 of the EGF receptor positioned at atomic coordinates 
substantially as shown in Figure 6; 

(ii) one or more subsets of said amino acids related to the coordinates shown in 
Figure 6 by whole body translations and/or rotations; or 

(iii) amino acids present in the amino acid sequence of ErbB2, ErbB3 or ErbB4, 
which form an equivalent three-dimensional structure to that of the receptor site defined by 
amino acids 1-621 of the EGF receptor positioned at atomic coordinates substantially as shown 
in Figure 6; 

(B) selecting a compound assessed in step (A) which possesses stereochemical 
complementarity to the molecule; 

(C) testing the compound in vivo or in vitro for its ability to 

(i) modulate binding of a natural ligand to the EGF receptor, ErbB3 or ErbB4, or 

(ii) modulate signal transduction by binding to the EGF receptor, ErbB2, ErbB3 or 
ErbB4; and 

(D) selecting a compound tested in step (C) that has the ability to 

(i) modulate binding of a natural ligand to the EGF receptor, ErbB3 or ErbB4, or 
(ii) modulate signal transduction by binding to the EGF receptor, ErbB2, ErbB3 or 
ErbB4, 

Summary 

PASS (Putative Active Sites with Spheres) is a simple computational tool that 
uses geometry to characterize regions of buried volume in proteins and to identify 
positions likely to represent binding sites based upon the size, shape, and burial extent 
of these volumes. PASS's utility as a predictive tool for binding site identification is 
tested by predicting known binding sites of proteins in the PDB using both complexed 
macromolecules and their corresponding apo-protein structures. The results indicate 
that PASS can serve as a front-end to fast docking. The main utility of PASS lies in the 
fact that it can analyze a moderate-size protein (~30 kD) in under twenty seconds, 
which makes it suitable for interactive molecular modeling, protein database analysis, 
and aggressive virtual screening efforts. As a modeling tool, PASS (i) rapidly identifies 
favorable regions of the protein surface, (ii) simplifies visualization of residues 
modulating binding in these regions, and (iii) provides a means of directly visualizing 
buried volume, which is often inferred indirectly from curvature in a surface 
representation. PASS produces output in the form of standard PDB files, which are 
suitable for any modeling package, and provides script files to simplify visualization in 
Cerius2®, InsightII®, MOE®, Quanta®, RasMol®, and Sybyl®. PASS is freely available to 
all. 

Keywords: protein active site, binding site, cavity detection, buried volume, molecular 
modeling, computer-aided drug design 

Introduction 



The identification and visualization of protein cavities is the starting point for 
many structure-based drug design (SBDD) applications. Sites of activity in proteins 
usually lie in cavities, where the binding of a substrate typically serves as a mechanism 
for triggering some event, such as a chemical modification or conformational change. 
Consequently, binding sites are often targeted in attempts to interrupt molecular 
processes via therapeutics. Although binding site locations are often furnished by x-ray 
data or fold recognition, tools that automatically predict these locations have become 
quite popular in SBDD, especially as front-ends to molecular docking or when alternate 
binding sites are sought [1,2]. The size and shape of protein cavities dictate the 
three-dimensional geometry of ligands that can strongly bind there; i.e. they must fit like 
a hand in glove. Thus, a minimal requirement for drug activity is that the molecule 
sterically fit the region of buried volume inscribing the active site cavity, with some 
allowance for induced fit. The determination and visualization of these volumes is 
critical in drug design, particularly since manual intervention is still fruitfully employed in 
most design scenarios. An ordinary stick representation of a protein, unfortunately, 
provides little insight regarding the location, shape, or size of its buried volumes. While 
surface representations [3, 4] are a step in the right direction, they still fall short in that 
they require the user to infer buried volumes from often-occluded void space. 
Consequently, methods for direct display of regions of buried volume in proteins have 
become prevalent in recent years [5-11]. Moreover, as molecular docking and virtual 
screening become more predictive and prevalent, the possibility of interfacing such 
tools with functional genomics via threading or homology modeling becomes 
increasingly tempting. A versatile tool that can rapidly predict binding sites should, 
therefore, find a niche as a front-end to such automated screening efforts. This paper 
describes a program called PASS (Putative Active Sites with Spheres), which may 
serve both as an interface to virtual screening and as a visualization aid for manual 
molecular modeling. 

Methods 

The PASS algorithm is designed to fill the cavities in a protein structure with a 
set of spheres and to identify a few of these spheres (called "active site points", ASPs) 
that most likely represent the centers of binding pockets. Crevice filling is performed in 
layers using three-point Connolly-like [3] sphere geometry. An initial coating of probe 
spheres is calculated with the protein as substrate, then additional layers of probes are 
accreted onto the previously found probe spheres. Only probes with low solvent 
exposure are retained, and the routine finishes when an accretion layer produces no 
new buried probe spheres. Although physical arguments can be made to substantiate 
PASS's success in binding site prediction, the algorithm itself is purely geometrical (see 
Figure 1). 

Calculation of Probe Spheres 

PASS begins by reading the Protein Data Bank (PDB) coordinates of a target 
protein and assigning elemental atomic radii (Table 1). Since a protein with explicitly 
represented hydrogen atoms contains less interstitial volume than one without 
hydrogen, PASS assigns a few different parameter values in the two cases. By default, 
if less than 20% of the atoms in the protein PDB file are hydrogen, then all hydrogen 
atoms are removed and hydrogen-free parameters are assigned; otherwise, hydrogen 
is retained and hydrogen-inclusive parameters are assigned (Table 1). The first layer of 
probe spheres is computed by looping over all unique triplets of protein atoms and, if 
they are close enough together, calculating the two locations at which a probe sphere 
(of radius Rprobe) may lie tangential to all three protein atoms (Fig. 1; Step a). Appendix 
A elucidates this three-point geometry, which is nontrivial since the radii are not 
necessarily equal. To be retained, a putative probe sphere must survive several filters 
(Fig. 1; Step b). The first condition is that it cannot overlap with any atoms of the 
accretion substrate. The second filter explicitly prohibits the probe from clashing with 
any protein atoms, while the third ensures that the probe be somewhat buried within the 
protein (i.e. in a binding-site-like region). In particular, each probe sphere is ascribed a 
"burial count" (BC) representing the extent to which it is excluded from solvent (Figure 
2). The BC of a probe is computed by counting the number of protein atoms that lie 
within a radius Rbc = 8 Å of it, and the probes are filtered such that any probe sphere with 
BC less than a threshold value (BCthreshold) is rejected. This threshold value was 
determined empirically, as were many of the PASS parameters, by visual inspection of 
results for a few test systems. Our experience has been that PASS's predictions are 
largely insensitive to the precise values of any of its parameters. Finally, probe spheres 
are "weeded" such that no two probe centers lie any closer together than Rweed = 1 Å. 
This keeps the distribution of probe spheres from becoming clumped, which enables 
reliable prediction of active site points from the final set of probes. 
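
As an illustration of the three-point geometry in Step a, the following Python sketch places a probe tangent to three atoms by trilateration. It is a minimal sketch, not the PASS implementation or the derivation of Appendix A: the function name `tangent_probes` and its helpers are hypothetical, and only the tangency condition (probe center at a distance of atom radius plus probe radius from each atom center) is taken from the text.

```python
import math

def _sub(a, b): return tuple(x - y for x, y in zip(a, b))
def _add(a, b): return tuple(x + y for x, y in zip(a, b))
def _scale(a, s): return tuple(x * s for x in a)
def _dot(a, b): return sum(x * y for x, y in zip(a, b))
def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tangent_probes(centers, radii, r_probe):
    """Return the two probe centers tangent to three atoms, or [] if none exist.

    Hypothetical helper: solves |x - p_k| = radii[k] + r_probe for k = 1..3
    by standard trilateration in a local frame built on the three centers.
    """
    p1, p2, p3 = centers
    d1, d2, d3 = (r + r_probe for r in radii)
    v12 = _sub(p2, p1)
    d = math.sqrt(_dot(v12, v12))
    if d < 1e-9:
        return []                      # coincident atoms
    ex = _scale(v12, 1.0 / d)
    v13 = _sub(p3, p1)
    i = _dot(ex, v13)
    ey_raw = _sub(v13, _scale(ex, i))
    j = math.sqrt(_dot(ey_raw, ey_raw))
    if j < 1e-9:
        return []                      # collinear atoms
    ey = _scale(ey_raw, 1.0 / j)
    ez = _cross(ex, ey)
    x = (d1 * d1 - d2 * d2 + d * d) / (2.0 * d)
    y = (d1 * d1 - d3 * d3 + i * i + j * j) / (2.0 * j) - (i / j) * x
    z2 = d1 * d1 - x * x - y * y
    if z2 < 0.0:
        return []                      # atoms too far apart to be bridged
    z = math.sqrt(z2)
    base = _add(p1, _add(_scale(ex, x), _scale(ey, y)))
    return [_add(base, _scale(ez, z)), _add(base, _scale(ez, -z))]
```

For three atoms that are too far apart to be bridged, the function returns an empty list, which corresponds to the triples rejected by the "close enough together" test.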



Table 1 - PASS Parameters 

Parameter                      Value
Rprobe, hydrogen-free          1.8 Å
BCthreshold, hydrogen-free     55
Rprobe, with hydrogen          1.5 Å
BCthreshold, with hydrogen     75
Rbc                            8.0 Å
Rweed                          1.0 Å
Raccretion                     0.7 Å
R0                             2.0 Å
D0                             1.0 Å
Rasp                           8.0 Å
PWmin                          1100

Elemental Radii [40]
Hydrogen                       1.20 Å
Oxygen                         1.52 Å
Nitrogen                       1.55 Å
Carbon                         1.70 Å
Sulfur                         1.80 Å

Values of PASS parameters, which are defined as follows. Rprobe - Radius of a probe 
sphere. BCthreshold - Threshold burial count (BC) distinguishing a buried probe from an 
exposed one. Rbc - Radius used to compute burial counts. Rweed - Minimal separation 
between probe spheres. Raccretion - Radius of probes as they are accreted onto existing 
probes. R0, D0 - Parameters defining the probe weight (PW) envelope function (see 
Fig. 2). Rasp - Minimal distance between active site points (ASPs). PWmin - Minimal PW 
for an ASP. 

After the seminal layer of probes is computed, additional layers of spheres are 
iteratively accreted onto the existing probe spheres. At each iteration, a set of new 
probe spheres is computed as described above (Fig. 1; Steps c,e), but with a smaller 
probe radius (Raccretion = 0.7 Å) and with the set of all probe spheres retained from 
previous layers as the accretion substrate. New probes, however, must still maintain a 
center-to-center distance of at least Rprobe + σi from each protein atom, i (of radius σi). 
The aforementioned filters are imposed when the newly-found spheres are combined 
with those retained from previous layers (Fig. 1; Step d). PASS continues the accretion 
phase until a layer is encountered in which none of the newly-found probe spheres 
survives the filters (Fig. 1; Step f). The result of this procedure is that the cavities, 
invaginations, and internal voids in the protein are filled with a set of fairly evenly- 
spaced probe spheres, all of which are buried and none of which sterically clashes with 
the protein. Furthermore, probes lying along the protein surface are packed in ideal 
steric contact with three protein atoms. 

Active Site Point (ASP) Determination 

PASS subsequently identifies a small number of "active site points" (ASPs) from 
amongst the final set of probe spheres (Fig. 1; Step g). These ASPs are meant to 
represent potential binding sites (i.e. centers of putative active sites) for ligands of 
arbitrary shape and polar character. Thus, PASS conservatively views a protein 
binding site as simply an invagination in the protein surface that is large enough to 
accommodate a ligand and possesses substantial solvent-excluded volume in which 
hydrophobic ligand moieties may be buried. ASPs are accordingly selected by 
identifying the central probes in regions that contain many spheres with high BCs. In 
particular, each probe is assigned a "probe weight" (PW), which is proportional to the 
number of probe spheres in the vicinity and the extent to which they are buried. The 
probe weight of the i-th probe is given by 

    PW(i) = \sum_{j=1}^{N} BC(j) \, f(|r_i - r_j|), 

where the sum runs over all N probe spheres, r_i is the position of probe i, and the 
envelope function, f(r), is shown in Figure 2. This is conceptually similar to the 
solvation term of Stouten et al. [12], the premise of which is that the solvation energy of 
an atom varies linearly with its exposure which, in turn, is proportional to the 
unoccupied volume around it. The final ASPs are determined by cycling through the 
probes in descending order of PW, keeping only those with PW > PWmin (= 1100) that 
are separated by a minimum distance Rasp (= 8 Å) from the ASPs already identified. 
Finally, the set of ASPs is rank-ordered according to PW values. These are PASS's 
predicted binding sites. 
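
The probe-weight sum and the greedy ASP selection described above can be sketched in a few lines of Python. The shape of f(r) is given only in Figure 2, which is not reproduced here, so the linear ramp in `envelope` (equal to 1 out to 2.0 Å, then falling to 0 over a further 1.0 Å, using the two envelope parameters of Table 1) is purely an assumption, as are all function names.

```python
import math

R0, D0 = 2.0, 1.0             # envelope parameters from Table 1 (Å)
R_ASP, PW_MIN = 8.0, 1100.0   # ASP separation (Å) and minimal probe weight

def envelope(r, r0=R0, d0=D0):
    """Assumed envelope: 1 inside r0, linear decay to 0 at r0 + d0."""
    if r <= r0:
        return 1.0
    if r >= r0 + d0:
        return 0.0
    return 1.0 - (r - r0) / d0

def probe_weights(centers, burial_counts):
    """PW(i) = sum over j of BC(j) * f(|r_i - r_j|), including j = i."""
    return [
        sum(bc_j * envelope(math.dist(c_i, c_j))
            for c_j, bc_j in zip(centers, burial_counts))
        for c_i in centers
    ]

def select_asps(centers, weights, pw_min=PW_MIN, r_asp=R_ASP):
    """Return probe indices chosen as ASPs, in descending order of weight."""
    order = sorted(range(len(centers)), key=lambda k: weights[k], reverse=True)
    asps = []
    for k in order:
        if weights[k] <= pw_min:
            break                      # all remaining probes are lighter still
        if all(math.dist(centers[k], centers[a]) >= r_asp for a in asps):
            asps.append(k)
    return asps
```

Because candidates are visited in descending order of weight, the returned list is already rank-ordered, and a heavy probe suppresses any lighter probe within the separation distance.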

PASS Output 

The default PASS output consists of (i) a PDB file containing the final set of 
probe spheres, (ii) a PDB file of the ASPs, and (iii) a separate PDB file for each ligand 
that was optionally read in (see below). By default, PASS "smooths" the probe 
spheres before writing the final set of "display" probes to a PDB file. In particular, only 
probes with at least 4 display probes lying within 2.5 Å are written to file by default. 
Smoothing removes all but appreciable groupings of probe spheres, leaving the final 
visualization less cluttered. Smoothing can be suppressed via the command-line flag 
[-all]. PASS also produces visualization scripts for several popular molecular modeling 
packages: namely, Cerius2® [13], InsightII® [14], MOE® [15], Quanta® [16], RasMol® [17], 
and Sybyl® [18]. These scripts, which are optionally produced via command-line flags 
(e.g. [-InsightII]), simplify visualization by automatically loading, rendering, and coloring 
the protein, probe spheres, ASPs, and ligands. PASS also displays detailed runtime 
information, including parameter settings, an account of sphere calculation and filtering 
(e.g. Table 2), and final probe sphere and ASP data, including BCs and PWs. PASS 
can also read the coordinates of bound ligands, either automatically from the protein 
PDB file (as HETATM entries with different residue names), or as separate files via the 
command-line flag [-ligand <filename.pdb>]. For each ligand, PASS computes the 
distance from each ASP to the nearest ligand atom and to the ligand center of mass. 
Other command-line options enable the user to (i) produce an enhanced set of probe 
spheres and ASPs ([-more]), (ii) suppress production of the probe sphere PDB file 
([-noprobes]), (iii) treat water molecules as part of the protein ([-water]), rather than 
ignoring them (which is the default behavior), (iv) specify an explicit output path ([-outdir 
<directory_path>]), (v) produce a set of PDB files containing subsets of the final probe 
spheres that were produced in the various layers of sphere calculation ([-layers]), and 
(vi) compute the volumes of all groupings of probe spheres left after smoothing 
([-volume]). None of these options slows PASS noticeably except the volume calculation, 
which proceeds as follows. After probe smoothing, the final set of display probes is 
agglomeratively clustered [19] by iteratively merging pairs of overlapping groups of 
probes until an iteration attempts to join two non-overlapping clusters. This determines 
both the optimal number of probe groups and the identities of spheres in these groups. 
Group volumes are subsequently computed by looping over probe spheres and 
estimating the volume increments statistically. If ligand(s) are present, distances are 
computed from the center of each group (i.e. the cluster center) to (i) the nearest ligand 
atom (Datom), and (ii) the ligand center of mass (Dcom). 
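
The grouping and volume steps might look like the following Python sketch. Two substitutions are made and should be read as assumptions: a union-find pass replaces the iterative agglomerative merging (for single-linkage merging both yield the same connected groups of overlapping spheres), and a Monte Carlo estimate over a bounding box stands in for the statistical volume calculation, whose details the text does not give. The sphere radius is left to the caller, since the text does not state which radius defines overlap.

```python
import math
import random

def cluster_overlapping(centers, radius):
    """Group sphere indices into connected sets of overlapping spheres."""
    parent = list(range(len(centers)))

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]   # path halving
            k = parent[k]
        return k

    for a in range(len(centers)):
        for b in range(a + 1, len(centers)):
            if math.dist(centers[a], centers[b]) < 2.0 * radius:
                parent[find(a)] = find(b)

    groups = {}
    for k in range(len(centers)):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())

def union_volume(centers, radius, samples=20000, seed=0):
    """Monte Carlo estimate of the volume of a union of equal spheres."""
    rng = random.Random(seed)
    lo = [min(c[k] for c in centers) - radius for k in range(3)]
    hi = [max(c[k] for c in centers) + radius for k in range(3)]
    box = math.prod(h - l for h, l in zip(hi, lo))
    hits = 0
    for _ in range(samples):
        p = tuple(rng.uniform(l, h) for l, h in zip(lo, hi))
        if any(math.dist(p, c) <= radius for c in centers):
            hits += 1
    return box * hits / samples
```

A group's volume is then `union_volume([centers[k] for k in group], radius)`. The sampling cost grows with the number of spheres, which is consistent with the volume option being the only one that slows the calculation noticeably.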

Results 

Table 2 shows the numbers of probe spheres retained at various stages of a 
PASS calculation on thermolysin (1hyt) and is meant to provide an impression of the 
practical operation of the algorithm. In layer #1 of the probe sphere calculation, the 
protein atoms constitute the accretion substrate, and every set of three protein atoms 
lying close enough together to be simultaneously touched by a single sphere (of radius 
Rprobe) must be identified and used to determine two putative probe sphere positions. 
The number of atomic triples that must be tried is reduced by first identifying atomic 
neighborhoods. The "neighborhood" of atom "i" is the set of atoms lying close enough 
to "i" to be bridged by a single probe sphere. In layer #1, 769,205 triples of protein 
atoms satisfied the neighborhood criterion, and 1,154,010 "bridging spheres" were 
located using these triplets. The number of bridging spheres is less than twice the 
number of atomic triples because not all triples of atoms in the accretion substrate that 
satisfy the neighborhood criterion can actually be bridged by a sphere of radius Rprobe. 
The set of bridging spheres is then filtered according to (i) clash with the accretion 
substrate, (ii) clash with the protein, (iii) burial count, and (iv) proximity to other probe 
spheres, in that order. After the substrate clash filter, 2,151 putative probe spheres 
remain and, since the protein is the accretion substrate in layer #1, the same number 
remains after the protein clash filter. All but 811 putative probes are discarded based 
upon insufficient burial, and 360 remain after these 811 are "weeded" to maintain a 
mutual separation of at least Rweed = 1.0 Å. Thus, 360 probe spheres are found in the 
first layer. The accretion substrate for the second and subsequent "accretion" layers is 
the set of probe spheres. In layer #2, the substrate of 360 probe spheres requires that 
384 substrate triples be tested, from which 560 bridging spheres are identified. After 
applying the four filters, only 60 new probe spheres remain, bringing the total number of 
probes to 420 after layer #2. This process is repeated until layer #7, in which no new 
probe spheres are identified, signalling the completion of probe sphere determination. 
Note that although the number of probe spheres continually grows as accretion 
proceeds, the number of accretion substrate triples that must be tried in each layer 
plateaus. This is because PASS is written such that only triples of substrate atoms 
incorporating a newly-found probe sphere (or the neighbor of a freshly-weeded probe) 
are tried. As a result, PASS's performance scales favorably with protein size 
(approximately MW^'^ over the molecular weight range in Table 3). 
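
The four filters can be sketched in Python as below. This is an illustrative reading of the text rather than PASS's own code: the function names, the greedy first-come weeding order, and the small tolerance that keeps a probe from "clashing" with the very atoms it is tangent to are all assumptions. Default values are the hydrogen-free parameters of Table 1.

```python
import math

def clashes(center, radius, spheres, tol=1e-9):
    """True if a probe of this radius overlaps any (center, radius) sphere."""
    return any(math.dist(center, c) < radius + r - tol for c, r in spheres)

def burial_count(center, atom_centers, r_bc=8.0):
    """Number of protein atoms within r_bc of the probe center."""
    return sum(1 for a in atom_centers if math.dist(center, a) <= r_bc)

def weed(candidates, kept, r_weed=1.0):
    """Greedily accept candidates at least r_weed from every accepted probe."""
    accepted = list(kept)
    for c in candidates:
        if all(math.dist(c, k) >= r_weed for k in accepted):
            accepted.append(c)
    return accepted[len(kept):]

def filter_probes(candidates, substrate, protein, kept,
                  r_layer=1.8, r_probe=1.8, bc_threshold=55):
    """Apply the four filters in the order given in the text.

    substrate, protein: lists of (center, radius) spheres.
    kept: probe centers retained from earlier layers.
    r_layer: probe radius used against the accretion substrate.
    r_probe: probe radius used against protein atoms.
    """
    out = [c for c in candidates if not clashes(c, r_layer, substrate)]
    out = [c for c in out if not clashes(c, r_probe, protein)]
    atoms = [a for a, _ in protein]
    out = [c for c in out if burial_count(c, atoms) >= bc_threshold]
    return weed(out, kept)
```

In the first layer the substrate and the protein are the same list, so the first two filters remove the same candidates, matching the identical counts reported after those two steps.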

PASS was first tested for its ability to identify known binding sites. Table 3 
shows the results of applying PASS to 30 protein-ligand complexes drawn from the 
PDB. The structures were chosen based upon diversity, resolution, inclusion in 
previous theoretical studies, and the existence of corresponding apo-protein x-ray 
structures in the PDB. In each case, hydrogen-free PASS parameters were assigned 
and bound water molecules were ignored. For each PDB complex, Table 3 shows the 
number of layers of probes PASS computed prior to convergence, the final number of 
probe spheres, the number of ASPs identified for each protein structure, and the 
required CPU time. Coordinates of the known ligand(s) are used to define a binding 
site "hit." In particular, for each ASP of a particular protein, two quantities are 
computed: (i) Datom, the distance from the ASP to the nearest ligand atom, and (ii) Dcom, 
the distance from the ASP to the ligand center of mass (COM). Any ASP with Datom < 4 Å 
is considered a binding site "hit." The Binding Site Hits column lists the rank order of 
the ASP(s) that are considered hits, and the values in the Datom and Dcom columns 
correspond to these hits. For instance, the "1hvr" row in Table 3 indicates that both the 
top ASP and the second-ranked ASP lie near the site in HIV-1 protease known to bind 
XK263. In particular, the top ASP lies 1.2 Å from the nearest XK263 atom and 2.3 Å 
from the COM, while the second-ranked ASP lies 0.8 Å from the nearest atom and 6.3 
Å from the COM. Note that ligand size impacts the Dcom values, as evidenced by the 
trypsin-PTI system, which has the largest ligand (a protein) and, correspondingly, the 
largest Dcom values (~19 Å). 
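
The hit criterion is simple enough to state as code. The sketch below computes the nearest-atom and center-of-mass distances for each ASP and reports the 1-based ranks that qualify as hits under the 4 Å cutoff. The function names and the (coordinates, mass) input layout are hypothetical choices made for illustration.

```python
import math

def ligand_distances(asp, ligand_atoms):
    """Return (nearest-atom distance, center-of-mass distance) for one ASP.

    ligand_atoms: list of ((x, y, z), mass) tuples.
    """
    d_atom = min(math.dist(asp, xyz) for xyz, _ in ligand_atoms)
    total = sum(m for _, m in ligand_atoms)
    com = tuple(sum(m * xyz[k] for xyz, m in ligand_atoms) / total
                for k in range(3))
    return d_atom, math.dist(asp, com)

def binding_site_hits(asps, ligand_atoms, cutoff=4.0):
    """1-based ranks of ASPs whose nearest ligand atom is closer than cutoff.

    asps must already be ordered by descending probe weight.
    """
    return [rank for rank, asp in enumerate(asps, start=1)
            if ligand_distances(asp, ligand_atoms)[0] < cutoff]
```

Only the nearest-atom distance decides a hit. The center-of-mass distance is reported alongside it and grows with ligand size, which is why a protein ligand can produce a hit with a large center-of-mass distance.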

Table 3 shows that PASS is able to successfully identify the locations of known 
binding sites in complexed x-ray structures. PASS located the pocket containing a 
known ligand in all but three of the 32 trials, often finding multiple binding site hits for a 
given ligand (11 times). In addition, the top-ranking ASP identified by PASS represents 
a binding site hit in 19 of the 32 trials, and one of the top three ASPs is a hit in 26 trials. 
These observations indicate that PASS can usually identify the protein cavity to which a 
ligand will bind with maximal affinity in a matter of seconds. There is a strong, but not 
perfect, correlation between ASP rank (i.e. PW) and the volume of the corresponding 
group of probe spheres. In fact, volume is approximately as predictive of binding sites 
(results not shown) as ASP rank for the systems in Table 3. However, the calculation 
of volumes slows PASS noticeably for systems requiring many probe spheres (e.g. 92, 
40, and 24 seconds for 1jst, 3aah, and 1etr, respectively). 

From a drug design perspective, the analysis presented in Table 3 is somewhat 
immaterial, since the existence of complexed coordinates implies that at least one 
binding site location is already known. Intuition suggests that the presence of a ligand 
in a complex might induce a more pronounced binding site cavity than would be 
present in an apo-protein structure, thereby biasing a cavity-detection algorithm like 
PASS to succeed on complexed systems. Thus, the postdiction of binding sites in PDB 
complexes does not establish the predictive utility of a tool for drug design, where one 
is lucky to have an apo x-ray structure or reliable homology model. 

A more realistic test of PASS as a tool for prediction is to try to locate known 
binding sites on the structures of proteins that are not complexed with a ligand. We 
address this predictability issue by using PASS to compute ASPs for the set of apo- 
protein structures from the PDB that correspond to complexed PDB structures in Table 
3. Apo structures were identified for as many of the systems in Table 3 as possible 
(20), and default PASS parameters were used in all calculations. A few of these PDB 
correspondences are not identical residue-by-residue because the molecules either 
were obtained from different sources (1npc/1hyt; 2apr/2er6), had residue additions or 
deletions at the termini (1swb/1stp; 1hxf/1dwd), or had incomplete or missing residues 
due to poor electron density (5dfr/4dfr; 1hxf/1dwd). For comparison, the results 
displayed in Table 4 are presented in the same order as in Table 3, and corresponding 
PDB codes are shown. "Known" binding site positions are determined by superposing 
the native and complexed structures and computing the proximity of the ASPs (from the 
native PASS calculation) to the known ligand (from the complexed crystal structure). 
This enables binding site "hits" to be computed as in Table 3, along with the distances 
Datom and Dcom relating the position of the known ligand to the binding site hits. Only 
backbone atoms {C, O, Cα, N} were superposed and, in all but a few cases (see Table 4 
caption), all residues in the chain were used. To quantify how severely the ligand 
deforms the protein in the binding site, we computed the RMSD between superposed 
structures using only residues lying in this region. In particular, we identified both the 
set {Cj} of residues lying within 4 Å of the ligand in the complex and the set {Aj} of 
corresponding residues in the superposed apo structure. The RMSD between {Cj} and 
{Aj} was then computed, using both side chain and backbone atoms for identical amino 
acids and only the backbone atoms otherwise. 
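
A minimal Python sketch of this binding-site RMSD follows. It assumes the two structures have already been superposed and that atoms have been paired between the complex and apo residues. The function names and the dictionary layout (residue identifier mapped to a list of atom coordinates) are hypothetical.

```python
import math

def site_residues(residues, ligand_atoms, cutoff=4.0):
    """Identifiers of residues with any atom within cutoff of any ligand atom.

    residues: dict mapping a residue identifier to a list of (x, y, z) atoms.
    ligand_atoms: list of (x, y, z) coordinates.
    """
    return sorted(
        res_id for res_id, atoms in residues.items()
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms)
    )

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length, paired atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    sq = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))
```

In use, the paired lists would hold side chain and backbone atoms for residues that are identical in both structures and backbone atoms only for the rest, as described above.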

Table 4 shows that PASS can reliably predict binding site locations when only an 
apo x-ray structure is known. PASS correctly identifies the binding site in 17 of the 21 
trials in Table 4. The top-ranked ASP hits the binding site in 12 trials, and one of the 
top three ASPs is a hit in 16 trials. These observations imply that PASS may be a 
suitable front-end to virtual high throughput screening and fast docking routines. 
Furthermore, the similarity of observed hit rates between the apo-protein and 
complexed systems refutes the hypothesis that the presence of a ligand in the 
structural data is a crucial determinant of success for a cavity detection algorithm. 

One additional option available in PASS is the generation of an enhanced set of 
probes and ASPs by running PASS in "more" mode via the [-more] command-line flag. 
In "more" mode, the burial count threshold is slightly reduced (by 10), which typically 
has the effect of enhancing the number of probe spheres by about a factor of two and 
ASPs by a factor of two or three, at the expense of about 20-30% in CPU time. When 
the systems in Tables 3 and 4 are analyzed in "more" mode, the binding site is detected 
in every case, with no ASP hit ranking worse than ninth. Tables 3 and 4 show (in 
parentheses) the ASP hits obtained in "more" mode for the few binding sites that the 
default PASS calculation failed to locate. Detailed inspection revealed that several of 
these default-mode misses contained an accumulation of probe spheres that fell just 
beneath the threshold defining an ASP. Running PASS in "more" mode is suggested 
when broad binding sites are anticipated (e.g. protein-protein association). 

The work of Mattos and Ringe [1, 20] constitutes the experimental analog of 
PASS and enables the most direct comparison of PASS to experimental data. In 
particular, Mattos and Ringe have soaked elastase crystals with a variety of small 
organic solvents and crystallographically determined the corresponding protein 
structures, including bound solvent molecules. These bound organic probes are meant 
to map out potential binding hot spots on the protein and suggest favorable ligand 
moieties. This raises the question of whether their organic probes tend to cluster in 
regions identified via PASS ASPs, which are likewise meant to identify possible hot 
spots. To address this, PASS was run on elastase and the resulting ASPs were 
graphically superimposed with Ringe et al.'s organic probes, along with a set of bound 
ligands drawn from the PDB. Figure 3 shows these results. Several clusters of organic 
probes are observed, most notably a large grouping in the active site (S1 pocket). 
Although only one organic probe lies within 8 Å of the top- or second-ranked ASPs, 
PASS places an ASP near four of the five largest clusters of probes. The inset to 
Figure 3 shows that the third-ranked ASP (pale blue) lies in the active site about 5 Å 
above the catalytic triad (whose surface is colored green). 

Figure 3 also addresses the question of whether clusters of these experimentally 
derived organic probes are more predictive of binding sites than PASS ASPs. 
Superposition of the ligands from nineteen elastase PDB complexes enables this 
comparison. All but three ligands bind in the S1 region of the known active site. The 
other three stick solely to an alternate site about 10 Å away (near S3'), while four 
molecules employ both sites. PASS identifies this alternate binding site via the fourth- 
ranked ASP (white); however, since only one organic probe lies in this region, this site 
cannot be identified solely on the basis of organic probe clusters. Conversely, there is a 
cluster of organic probes near the S4 binding pocket, but no ASP is placed there (this 
region is too close to the ASP in the S1 pocket). Thus, clusters of the organic probes of 
Ringe et al. and the ASPs of PASS appear comparably predictive of the known binding 
sites in elastase. It should be noted that the physical nature of the probes employed by 
PASS and by Ringe et al. is drastically different, so one should not expect identical 
distributions of binding hot-spots in the two cases. Ringe et al. probe the protein 
surface with small, often quite polar, molecules, precisely the opposite of PASS ASPs, 
which can be thought of as large and apolar. ASPs are effectively apolar in that they 
are identified solely on the basis of cavity size, shape and burial, with no regard for e.g. 
electrostatics and hydrogen bonding. Moreover, the PASS parameters have been 
tuned such that only a cavity of a certain critical size can sustain an ASP. Over the set 
of systems in Table 3, the smallest regions of buried volume containing an ASP are 
approximately the size of a benzene ring, while ASP regions that bind a ligand are 
typically three- to ten-fold larger than that. It is gratifying, however, that the central 
binding site (S1) is unambiguously identified by both methods. 

Figure 3 - Comparison to Crystallographically-Determined Organic Probes 




PASS was run in "more" mode using a cross-linked structure of elastase provided by Ringe and Mattos. 
The resulting ASPs are rendered as large spheres and colored according to probe weight, PW (see 
scale). Crystallographically determined organic probes (acetonitrile, dimethylformamide, acetone, 
ethanol, isopropanol, hexenediol) are displayed as solid yellow sticks. Although only one organic probe 
lies within 8 Å of the top- or second-ranked ASP, four of the five largest clusters of organic probes lie in a 
region identified as a potential binding site by PASS. Every E.C.3.4.21.36 elastase complex in the PDB 
(19 structures, 20 ligands: 1bma, 1btu, 1eai, 1eas, 1eat, 1eau, 1ela, 1elb, 1elc, 1eld, 1ele, 1elf, 1elg, 
1esb, 1fle, 1inc, 1jim, 1nes, 9est) was superposed onto the cross-linked elastase structure, and the 
resulting ligand overlays are shown as orange, blue, and magenta sticks (except for two protein-bound 
structures, 1eai and 1fle). The inset shows a top view of the protein surface at the active site, with the 
portion of the surface defined by the catalytic triad colored green. The third-ranked ASP (pale blue) is 
centrally located in the active site (S1 region), while the fourth-ranked ASP (white) identifies an alternate 
binding site about 10 Å away (S3' region). Only 4 ligands (two of which are proteins) bind to both sites 
(colored blue). Thirteen of the twenty ligands (colored orange) bind in the S1 pocket but not in the 
alternate site. The other three ligands (1elf, 1elg, 1nes; colored magenta) bind only to the alternate site. 
Since only one organic probe lies in this region, probe clusters alone cannot identify this as a potential 
small molecule binding site. Conversely, a cluster of three organic probes lies in the S4 region, in a 
pocket that PASS failed to identify because it lies too close (i.e. < Rasp = 8 Å) to the S1 ASP. 

Discussion 

PASS in a Virtual Screening Environment 

The hit rates shown in Table 4 indicate that PASS may serve as a front-end to 
virtual screening when the binding site is unknown or when alternative binding sites are 
sought. If the screening tool is fast enough that docking against multiple sites is 
permissible, then separate screening calculations can be run with the search space 
centered on the top few PASS ASPs. This strategy should enable identification of the 
optimal binding mode in most cases, as evidenced by the 71% hit rate to the top two 
ASPs in Table 4. A number of other screening strategies incorporating PASS are also 
possible. For instance, a more rigorous procedure could be used to select the "true" 
binding site from amongst the full set of ASP predictions. Using a docking routine with 
a more detailed scoring function, the affinity of a ligand for the different ASP regions 
can be directly compared. Thus, screening a small set of diverse probe molecules or 
fragments against all the ASPs might enable one to identify the stickiest region of the 
protein by comparing the scores of the top binders to each ASP region. A large 
database of ligands could then be computationally screened against this region. Since 
ASPs are determined using only steric size and shape, the electrostatic (ES) and 
hydrogen-bonding (HB) character of the ASP sites is arbitrary. One might, thus, search 
these sites for novel pharmacophores and construct focused combinatorial libraries 
designed to hit them. Conversely, one could use ES and HB characterization of ASP 
regions to select sites most likely to possess affinity for a given class of compounds. 
Perhaps the most alluring aspect of PASS's speed is that it (i) permits the expeditious 
analysis of entire structural databases (e.g. PDB, corporate), and (ii) could provide a 
suitable bridge between 3D structural modeling and ligand docking in a future drug 
design project designed to make use of genomic data. 
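
The idea of comparing the top binders to each ASP region can be illustrated with a minimal sketch. It assumes that a docking run has already produced one score per probe molecule per ASP region and that lower scores mean tighter binding; the function name, the `top_n` parameter, and the score convention are all hypothetical and are not part of PASS.

```python
def stickiest_asp(scores_by_asp, top_n=3):
    """Pick the ASP region whose best-scoring probes bind most favorably.

    scores_by_asp maps an ASP label to a non-empty list of docking scores.
    Assumption: lower score = better binding (typical of energy-like scores).
    """
    def top_mean(scores):
        best = sorted(scores)[:top_n]      # the top_n most favorable scores
        return sum(best) / len(best)

    # The region with the lowest mean over its top binders is the "stickiest".
    return min(scores_by_asp, key=lambda asp: top_mean(scores_by_asp[asp]))


if __name__ == "__main__":
    demo = {
        "ASP1": [-5.0, -6.0, -4.0, -1.0],
        "ASP2": [-9.0, -8.0, -2.0, -1.0],
        "ASP3": [-3.0],
    }
    print(stickiest_asp(demo))
```

Averaging over a few top binders rather than taking the single best score is one way to reduce sensitivity to a lone outlier; other aggregations would serve equally well.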

PASS as an Interactive Visualization Tool 

A PASS calculation on a moderate-sized protein (~30 kD) takes less than 
twenty seconds on a single Silicon Graphics R10000 processor (Table 3). PASS is, 
therefore, fast enough to be used interactively in a molecular modeling environment, 
and has particular utility as a visualization tool for drug design. By default, PASS 
produces PDB files of probes, ASPs, and ligand(s) (when specified), which can be 
loaded and rendered separately using any molecular modeling package. Alternatively, a 
full display of the PASS output can be produced in a single step (in supported modeling 
suites) by executing a PASS visualization script, which loads, renders, and colors the 
protein, probe spheres, ASPs, and ligand(s). ASP coloring denotes probe weight (PW), 
while the probe spheres can be colored according to either (i) burial count (BC), (ii) 
group identity (optionally invoked via [-group]), or (iii) the layer of accretion in which 
each was identified. Color values (0-50) are encoded onto the B-factor column of the 
output PDB files containing the probes and ASPs. In runs for which the probes are 
smoothed and grouped, an integer specifying the group membership of each probe 
sphere is encoded onto the occupancy column of the probe PDB file. Figure 4 shows a 
standard PASS visualization in InsightII for RNAse A (1rob), which is rendered as a 
tube for clarity. The probes are rendered as small spheres and colored according to 
BC, while the two ASPs are rendered as larger spheres and colored by PW. The 
ligand, cytidylic acid, is shown in white and is mostly occluded by probes and the 
second-ranked ASP. Because the ligand binds to a long groove in the RNAse surface 
rather than a deep pocket, the ASP lying in the true binding site has a lower PW than 
the one shown at the right, which lies in a rounder cavity. 

Figure 4 - PASS Visualization of RNAse A 

RNAse A (1rob) is shown in green and is rendered as a tube for clarity, while the 
cytidylic acid ligand is rendered in white sticks and is barely visible. The final probe 
spheres, which have been smoothed, are represented by small spheres and colored 
according to burial count. Active site points (ASPs) are rendered as larger spheres and 
colored by probe weight. The second-ranked ASP lies in the binding site. 
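
Because the color value and group membership are stored in standard PDB columns, they can be recovered with a few lines of code. The sketch below relies only on the fixed-column PDB layout (occupancy in columns 55-60, B-factor in columns 61-66). Whether PASS writes ATOM or HETATM records, and which atom and residue names it uses, is not stated here, so the reader accepts both record types and the `PRB` names in the writer are placeholders.

```python
def format_sphere(serial, x, y, z, group, color, name="PRB", resname="PRB"):
    """Write one fixed-column PDB HETATM record (names are placeholders)."""
    return ("HETATM{:5d} {:<4s} {:3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
            .format(serial, name, resname, "X", serial, x, y, z, float(group), color))


def read_spheres(pdb_text):
    """Parse ATOM/HETATM records into coordinates, group, and color value."""
    spheres = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        spheres.append({
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            "group": int(float(line[54:60])),  # occupancy column: group id
            "color": float(line[60:66]),       # B-factor column: color value
        })
    return spheres


if __name__ == "__main__":
    record = format_sphere(1, 1.0, -2.5, 3.25, group=2, color=37.0)
    print(read_spheres(record))
```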

One advantage of PASS as a visualization tool is that displaying the ASPs 
relative to the protein enables immediate identification of regions likely to be of interest 
in drug design. Since the ASPs are centrally located in cavities, one can use the 
displayed ASPs and a distance-based criterion to quickly identify the residues 
modulating binding in these regions. For the modeling suites that support subsetting 
(e.g. InsightII), the PASS visualization scripts automatically define 6 Å, 8 Å, and 10 Å 
residue-based subsets around each ASP, which facilitate the coloring and specific 
display of these regions. Figure 5 shows the 8 Å subset of protein residues around the 
top-ranked ASP of trypsin (3ptb). The ASP is shown in magenta, while the probe 
spheres are colored by burial count. The residues involved in benzamidinium binding 
are captured in this subset; e.g. hydrogen-bond partners are indicated by yellow lines. 
The probe coloring clearly indicates that the mouth of the binding pocket lies to the 
right, where the probe spheres have lowest burial counts. Because PASS ASPs are 
centrally located in cavities, 6-10 Å radial subsetting almost always enables selective 
visualization of all the residues defining a protein cavity. 
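
A distance-based, residue-level subset of this kind is simple to reproduce outside a modeling suite. The following sketch assumes that a residue belongs to a subset when at least one of its atoms lies within the cutoff of the ASP; that membership rule, the function names, and the `(residue_id, xyz)` input format are assumptions for illustration and may differ from what the PASS scripts actually do.

```python
import math


def residues_near_point(atoms, center, cutoff):
    """Return sorted residue ids with at least one atom within cutoff of center.

    atoms is an iterable of (residue_id, (x, y, z)) pairs, distances in Å.
    """
    hits = set()
    for res_id, xyz in atoms:
        if math.dist(xyz, center) <= cutoff:
            hits.add(res_id)
    return sorted(hits)


def asp_subsets(atoms, asp_xyz, cutoffs=(6.0, 8.0, 10.0)):
    """Build one residue subset per cutoff radius around a single ASP."""
    atoms = list(atoms)
    return {c: residues_near_point(atoms, asp_xyz, c) for c in cutoffs}


if __name__ == "__main__":
    demo_atoms = [
        (("A", 1), (0.0, 0.0, 5.0)),
        (("A", 1), (0.0, 0.0, 12.0)),
        (("A", 2), (0.0, 0.0, 7.0)),
        (("A", 3), (0.0, 0.0, 9.5)),
        (("A", 4), (0.0, 0.0, 20.0)),
    ]
    print(asp_subsets(demo_atoms, (0.0, 0.0, 0.0)))
```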






Figure 5 - Residues Modulating the Binding of Benzamidinium to Trypsin 




The residues lining the binding pocket of trypsin (3ptb) are rendered as sticks 
and colored according to atom type. They were selected by defining an 8 Å 
residue-based zone centered on the top-ranked PASS active site point, shown 
in magenta. The bound benzamidinium is shown in white, while the probe 
spheres near the pocket are rendered as small spheres and colored according 
to burial count (BC). The BC color scale runs from blue (high BC) to red (low 
BC), with muted colors denoting intermediate values. Dashed lines represent 
hydrogen bonds between benzamidinium and trypsin residues (D189 and 
G219), with distances measured in Angstroms. 

By identifying multiple ASPs, PASS also suggests alternate binding sites in 
proteins for which a primary site(s) of binding has already been established. The 
pursuit of alternate binding sites is becoming increasingly prevalent in light of the 
mounting realization that many proteins have more than one biochemical role [21] and 
are likely to employ separate binding sites in performing distinct biochemical tasks. In 
addition, many enzymes have allosteric binding sites that affect catalytic activity or 
substrate binding via the induction of conformational changes upon cofactor binding 
[22]. PASS can suggest the locations of such sites. Finally, the disruption of 
protein-protein interactions forms the basis of many drug design efforts, and PASS can be used 
to identify interfacial pockets that may be suitable targets for drug binding. In particular, 
interfaces may be identified by using probe spheres to compute a difference map 
between the bound and unbound forms. This approach can be extended to quickly 
identify and visualize packing contacts in protein crystals or multimeric forms. 

PASS also facilitates the visualization of buried volumes in a protein in that the 
space occupied by the manifold of probe spheres represents this volume, which can be 
viewed and manipulated as a solid object by rendering the probes in a space-filling 
model. Mesh or solid representations of various surfaces (molecular, van der Waals, 
Connolly) are often used to visualize the shape complementarity of a protein surface for 
putative ligands or functional groups. Often these surfaces are colored according to 
some other receptor-based property, such as electrostatics, hydrogen bond propensity, 
or surface curvature. The idea is that a modeler can use this sort of display to look for 
likely ligand hot-spots on the protein by visually searching the surface for voluminous 
invaginations that are colored to indicate favorable complementarity in, say, 
electrostatic potential. In reality, ligands only bind to regions possessing enough buried 
volume to significantly accommodate them. Hence, buried volume is a quantity of central 
importance in drug design, and the development of methods for informatively displaying 
such regions should be accorded due attention. Surface representations fail to capture 
buried volumes directly in that the user is left to infer the buried volume from void 
space, much of which is obscured from view by the surface. Likewise, colored surface 
quantities are of most interest near deep invaginations, precisely where the surface is 
most difficult to see. Unfortunately, user expertise is typically required to overcome 
such difficulties. PASS takes a more direct approach by filling the buried volumes with 
a set of unbonded atoms that represent the ASPs and probe spheres. This enables 
both the size and shape of the buried volumes to be viewed directly, either with or 
without the protein, using any molecular visualization tool. Rendering the buried 
volumes as solid allows the user to eyeball the fit of certain ligands and groups to 
potential hot-spot regions. Figure 6 shows the region of buried volume (orange) lying in 
the binding cavity of retinol binding protein (1rbp), along with the bound retinol (white), 
some surrounding residues, and the top- and third-ranked ASPs (in magenta), on the 
left and right, respectively. Information equivalent to what is color-coded onto protein 
surface displays can, in principle, be captured by property-based coloring of probe 
spheres. For instance, the user could perform a finite-difference Poisson-Boltzmann 
calculation and color the probe spheres according to electrostatic potential, φ. Directly 
displaying φ in the region of interest, rather than having to infer it from φ at the protein 
surface, provides a more meaningful view of electrostatics than a surface 
representation. Favorable hydrogen-bond donor and acceptor positions can likewise be 
more meaningfully defined within the manifold of probe spheres than on a protein 
surface. Interaction-based coloring schemes are not presently automated within PASS, 
however. 

Comparative Study 

Many procedures for characterizing and visualizing protein cavities have been 
presented in the past and, while all differ substantially from PASS, comparative study 
serves to highlight some of PASS's strengths and weaknesses. First, almost all prior 
methods identify cavity regions using some type of regular grid [2, 5, 6, 8-11, 23-26]. A 
grid simply provides the coordinates of points lying in cavities, which are then used in 
some fashion to identify boundaries with the protein and, for all but internal voids, with 
empty space. One disadvantage of using a grid is that its storage consumes memory 
unnecessarily. Likewise, uncertainties arise relating to the possible dependence of 
results upon grid spacing or positioning. Orientational dependence was indeed found 
in the program POCKET [9, 24]. The advantage of implementing a grid is purely 
algorithmic, as there is no physical reason to use regular geometry when it is well 
known that protein packing and protein surfaces are extremely irregular [27], if not 
fractal [28]. The PASS algorithm captures this irregularity by using geometry to project 
outward from the known atomic coordinates in order to inscribe cavity regions. 
Although this sort of protein-based approach has been taken by other groups [7, 8, 29, 
30], the geometry employed in these studies differs significantly from PASS. Every point 
in a protein cavity may be thought to represent a sphere that lies exactly tangential to 
the protein surface. The radius of this sphere is the distance of closest approach, and 
the sphere generally touches the protein at one, two, or three points (i.e. atoms). 
Several authors have used this correspondence (in reverse) to define points lying in 
cavity regions by specifying a set of probe spheres and using geometry (one-, two-, 
and/or three-point) to project outward from the protein atoms into the cavity region. For 
instance, cavity points have been obtained by placing tangential spheres midway 
between atoms [8, 30] and by rolling a probe sphere over the set of atomic spheres 
representing the protein [7, 10]. The resulting probe coordinates usually correspond to 
one or two points of tangency with the protein. However, the sterically optimal packing 
of a spherical probe against the protein has the probe lying tangent to exactly three 
atoms, just as a marble that is dropped onto a pile of other marbles will come to rest 
touching exactly three. Unlike any previous method, PASS uses only three-point 
geometry to obtain points lying in cavity regions. Consequently, the shape of the 
rendered manifold of PASS probes represents maximally favorable sterics. One might 
expect that positioning the probe spheres using only three-point geometry would give 
rise to a spotty distribution of probes and poorly-shaped buried volume. Practical 
experience has shown, however, that PASS produces smooth well-shaped buried 
volume manifolds (e.g. Figure 6), and that using only three-point geometry helps 
minimize the number of points required to fill protein cavities. 

The most ambiguous aspect of cavity characterization lies in deciding where to 
place the boundary between the pocket and free space; i.e. determining "sea-level" [8]. 
Several studies appearing in the literature [5, 6, 10] operate by filling fully-enclosed 
volumes (e.g. "flood fill") and, thus, require an artificial means of closing-off the mouths 
of cavities in order to define sea-level. With many other methods [8, 9, 23, 24], the 
definition of sea-level arises as a byproduct of the algorithm itself and has no physical 
significance. The work of Kuntz et al. [7] is closest in spirit to the present study with 
regard to sea-level definition. Their method uses the Connolly surface as a substrate 
for sphere growth and rejects spheres based upon two criteria: (1) an angular 
condition, which essentially selects concave regions over flat or convex ones, and (2) a 
5 Å upper bound on radial sphere growth. Their radial constraint is expected to 
generate sea-level boundaries similar to those found with PASS. Unlike any other 
method of cavity detection, however, PASS explicitly defines sea-level according to a 
quantity of known physical significance, solvent accessibility, as quantified by burial 
counts (BC). 
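
A burial-count filter of this kind can be sketched in a few lines. The sketch assumes that the burial count of a probe is the number of protein atoms whose centers lie within a fixed radius of the probe center; the 8 Å radius and the `bc_min` threshold used here are illustrative placeholders, not documented PASS defaults, and the function names are hypothetical.

```python
import math


def burial_count(probe_xyz, protein_xyz, radius=8.0):
    """Count protein atoms within `radius` (Å) of a probe center."""
    return sum(1 for atom in protein_xyz if math.dist(atom, probe_xyz) <= radius)


def filter_by_burial(probes, protein_xyz, bc_min, radius=8.0):
    """Keep only probes buried at least as deeply as bc_min (sea-level cut)."""
    return [p for p in probes if burial_count(p, protein_xyz, radius) >= bc_min]


if __name__ == "__main__":
    # Toy "protein": a 5 x 5 x 5 cubic lattice of atoms with 2 Å spacing.
    lattice = [(2.0 * i, 2.0 * j, 2.0 * k)
               for i in range(5) for j in range(5) for k in range(5)]
    center_probe = (4.0, 4.0, 4.0)    # sits in the middle of the lattice
    outside_probe = (4.0, 4.0, 20.0)  # far above the lattice
    print(burial_count(center_probe, lattice), burial_count(outside_probe, lattice))
    print(filter_by_burial([center_probe, outside_probe], lattice, bc_min=50))
```

A linear scan over all atoms is adequate for a sketch; a production implementation would more likely use a cell list or k-d tree for the neighbor search.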

Computational speed and ease of use are also important criteria for comparison 
and, in these categories, PASS rates favorably with all published methods. Although 
reliable speed comparison is difficult since few studies report CPU times [2, 8, 10, 26, 
30] and others report times on old processors [5, 7, 11, 29], the fastest CPU times 
reported in the literature belong to the LIGSITE program of Hendlich et al. [24], which 
can analyze a moderate-sized protein (at 0.5 Å grid spacing) in about 15 seconds. This 
is approximately the same speed demonstrated by PASS; however, the LIGSITE CPU 
time ramps up very steeply as the grid spacing is reduced (twelve-fold slower at 0.25 
Å), and the authors provide only a cursory investigation of the dependence of their 
results upon grid scale. PASS also excels in usability in that it requires no startup cost 
to use because the inputs are simple and the outputs are standard. A few programs in 
the literature appear to have shared this design perspective [8, 23, 24, 29]. The input to 
PASS is restricted to a PDB file(s) specifying the protein(s) coordinates plus a few 
optional command-line flags that can be used to control more detailed behavior. PASS 
produces versatile output in the form of standard PDB files, which allows the user to 
immediately view the results using whatever modeling tool is already familiar. 






Physical Underpinnings 

Although the roots of the PASS algorithm are geometrical, not statistical 
mechanical, it is useful in light of PASS's success in identifying known binding sites to 
examine a posteriori which physical interactions (if any) are mimicked in PASS. PASS 
takes the philosophy that the task of binding site prediction is to identify regions of 
space along the protein where an arbitrary ligand might tightly bind. A physically 
well-designed algorithm should incorporate as many contributions to binding affinity as 
possible without sacrificing applicability over a wide range of ligands. Binding affinity is 
dictated by the free energy change induced by the binding process, $\Delta G_{\mathrm{bind}}$, which is 
known to have numerous contributions, both enthalpic and entropic. While there is 
disagreement regarding some factors [31-33], sterics, electrostatics, hydrogen-bonding, 
and solvation are known to be major players [34-38]. Of course, the fine details of 
ligand size, shape, flexibility, hydrogen-bonding propensity, and polar character are 
crucial determinants of $\Delta G_{\mathrm{bind}}$; however, the observation that proteins usually bind 
ligands strongly at only a few sites suggests that one might be able to use coarse 
details of ligand character (e.g. size) to identify these few binding sites. Thus, PASS 
must make its predictions using only binding affinity contributions that depend upon 
coarse ligand character. Two important contributions to $\Delta G_{\mathrm{bind}}$ fit this description: 
solvation and sterics. Ligand binding is always favored entropically by the desolvation 
of molecular moieties, regardless of polarity [39]. This is because the hydration of any 
atomic group causes net ordering in the first few solvation shells of surrounding water. 
The PASS algorithm mimics this desolvation effect via the rejection of probe spheres 
based upon burial count. Likewise, the formation of steric (i.e. enthalpic van der Waals) 
contacts between ligand and protein is generally favorable, regardless of the ligand. 
Although the steric contribution to $\Delta G_{\mathrm{bind}}$ depends upon detailed molecular shape, the 
hardness of the steric interaction precludes any ligand from binding tightly to the protein 
without adopting a configuration consistent with the size and shape of the buried 
volume. PASS includes sterics by imposing an implicit size and shape criterion upon 
which regions of buried volume can be identified as active site points (ASPs). In 
particular, a region of buried volume that is either too small or too narrow to contain 
even a small ligand without steric clash will never contain an ASP because too few 
probe spheres will lie in the region for any one to have a large enough probe weight to 
be selected as an ASP. The PASS parameters (esp. R„ and $\mathrm{PW}_{\mathrm{min}}$) have been 
empirically tuned to make this distinction reliably. 

Similar arguments cannot be made regarding the electrostatic interaction, for 
instance, which may contribute either attractively or repulsively to $\Delta G_{\mathrm{bind}}$, depending 
upon ligand charge and polarity. Several other programs in the literature, however, 
implement energetics in an effort to use other factors (e.g. hydrophobicity, 
electrostatics) to help identify and rank potential binding site cavities [2, 5, 26]. Most 
notably, Ruppert et al. present the most impressive results in the literature with regard 
to accuracy in locating binding sites [2]. Their method uses an in-house empirical 
forcefield to dock three different types of probes (steric, H-bond donor, H-bond 
acceptor) against the protein binding site. This maps out a set of favorable "probe" 
positions and permits the identification of "sticky spots" on the protein, which are used 
as central points to carve-out individual pockets. Although they provide no CPU times, 
their algorithm requires significant docking and, thus, is probably considerably slower 
than PASS or LIGSITE. They apply this method to the prediction of binding sites in a 
set of 11 PDB complexes and find that their top-ranked pocket contains the ligand in 
every case. Nine of these eleven cases, however, are included in the PASS test set 
(Table 3), and strikingly similar results are obtained with PASS. The top-ranked ASP is 
a binding site hit in eight of the nine overlapping trials, and the second ASP is a hit in 
the other case. Although factors such as electrostatics and hydrogen-bonding certainly 
contribute to the affinity of a ligand for a particular cavity, the perspective taken in PASS 
is that only the most ligand-independent contributions to binding (i.e. size, shape, and 
burial extent of cavities) should contribute to binding site prediction. Energetic factors 
that strongly modulate specificity should be addressed case-by-case, either manually 
by the user or via downstream software (e.g. docking). Thus, the PASS ASP regions 
are completely inclusive with regard to electrostatic and hydrogen-bonding character, 
with the intention that each will be reinvestigated individually in light of a particular 
application or desired complementarity. PASS's success in predicting binding sites 
without electrostatics and hydrogen-bonding constitutes a remarkable restatement of 
the importance of solvation and sterics in binding. 

Conclusions 

PASS is a simple cavity detection tool that has utility in both virtual screening 
and interactive molecular modeling environments. PASS was shown to reliably predict 
the locations of known binding sites using a set of 20 apo-protein x-ray structures from 
the PDB, thereby establishing its utility as a front-end to fast docking and virtual 
screening. Furthermore, for the price of a thirty-second investment, PASS provides the 
user a meaningful view of the buried volumes in a protein, suggests alternate binding 
sites, and simplifies detailed visualization of potential binding hot-spots. PASS is freely 
available in Unix executable form (SGI Irix, SunOS, Linux) to all users via the Protein 
Data Bank web site under "PDB-related Software" (http://www.pdb.bnl.gov/pdb-docs/software.html). 

Appendix A - Three-Point Sphere Geometry 

The sphere placement algorithm in PASS hinges upon solution of the following 
geometry problem. Given three "base" spheres (i, j, and k) of known positions ($\mathbf{R}_i$, $\mathbf{R}_j$, 
$\mathbf{R}_k$) and radii ($\sigma_i$, $\sigma_j$, and $\sigma_k$), at what two positions ($\mathbf{R}_p$) can a "probe" sphere of radius 
$\sigma_p$ be placed so as to be exactly tangential to all three base spheres? We seek the 
general solution, in which none of the radii are necessarily equal and the coordinates of 
the base spheres are unconstrained. Figure A1 illustrates the situation: sphere 
perimeters are outlined, base sphere centers are labelled "i", "j", "k", the "base plane" 
(i-j-k) is shaded, the probe sphere is shaded and labelled "p", and vectors are denoted 
with uppercase lettering while points and distances are in lowercase. The origin of the 
global coordinates is labelled "O", while a local frame is defined by unit vectors {$\mathbf{x}'$, $\mathbf{y}'$, $\mathbf{z}'$}. 
There are, in general, two solutions for $\mathbf{R}_p$, one on either side of the base plane. 
However, one must first impose several conditions to ensure the existence of a 
solution. If any pair {i,j} of base spheres are too far apart, the probe will be unable to 
bridge the gap, so one must first ensure that $|\mathbf{R}_j - \mathbf{R}_i| < \sigma_i + \sigma_j + 2\sigma_p$, and likewise for 
pairs {i,k} and {j,k}. One must also make sure that no base sphere lies entirely within 
the volume occupied by the other two. With these conditions satisfied, the coordinates 
$\mathbf{R}_p$ of the two valid probe sphere positions may be written 

$$\mathbf{R}_p = \mathbf{R}_b \pm h\,\mathbf{z}', \tag{A.1}$$

where $h$ is the height of the probe above the base plane, and $\mathbf{z}'$ is a unit normal to this 
plane. To be precise, the local coordinate frame {$\mathbf{x}'$, $\mathbf{y}'$, $\mathbf{z}'$} is right-handed, with $\mathbf{x}'$ 
lying along $\mathbf{R}_j - \mathbf{R}_i$ and $\mathbf{z}'$ pointing out of the base plane in the direction of $\mathbf{x}' \times (\mathbf{R}_k - \mathbf{R}_i)$. 
The right triangle i-b-p gives the height 

$$h = \sqrt{(\sigma_i + \sigma_p)^2 - |\mathbf{R}_b - \mathbf{R}_i|^2}. \tag{A.2}$$

The vector $\mathbf{R}_b$, from O to the point of projection of the probe onto the base plane, b, can 
be written vectorially as 

$$\mathbf{R}_b = \mathbf{R}_i + (\mathbf{T}_{ij} - \mathbf{R}_i) + \mathbf{U}, \tag{A.3}$$

which leaves $\mathbf{T}_{ij}$ and $\mathbf{U}$ undetermined. In general, point b need not lie on the interior of 
triangle i-j-k, as drawn, but the equations are the same in either case. $\mathbf{U}$ can be 
eliminated from Eqn. (A.3) by observing that 

$$\mathbf{U} \cdot (\mathbf{T}_{ik} - \mathbf{R}_i) = \mathbf{V} \cdot (\mathbf{T}_{ik} - \mathbf{R}_i), \tag{A.4}$$

where $\mathbf{V} = \mathbf{T}_{ik} - \mathbf{T}_{ij}$, and $\mathbf{U}$ points in the direction of $\mathbf{y}'$. Solving Eqn. (A.4) for $\mathbf{U}$ yields 

$$\mathbf{U} = \frac{(\mathbf{T}_{ik} - \mathbf{T}_{ij}) \cdot (\mathbf{T}_{ik} - \mathbf{R}_i)}{\mathbf{y}' \cdot (\mathbf{T}_{ik} - \mathbf{R}_i)}\,\mathbf{y}'. \tag{A.5}$$

The remaining vectors {$\mathbf{T}_{ij}$, $\mathbf{T}_{ik}$, $\mathbf{T}_{jk}$}, which run from O to points {$t_{ij}$, $t_{ik}$, $t_{jk}$}, are found by 
considering the triangles formed by two base spheres and the probe sphere. For 
instance, the triangle i-j-p comprises two right triangles, i-$t_{ij}$-p and j-$t_{ij}$-p. Applying the 
Pythagorean theorem to each enables determination of the distance from i to $t_{ij}$ via a 
quadratic equation, which yields the desired vector 

$$\mathbf{T}_{ij} = \tfrac{1}{2}(\mathbf{R}_i + \mathbf{R}_j) + \frac{(\sigma_i + \sigma_p)^2 - (\sigma_j + \sigma_p)^2}{2\,|\mathbf{R}_j - \mathbf{R}_i|^2}\,(\mathbf{R}_j - \mathbf{R}_i). \tag{A.6}$$

Swapping indices in Eqn. (A.6) gives analogous equations for $\mathbf{T}_{ik}$ and $\mathbf{T}_{jk}$. The normal 
vector, $\mathbf{n}$, to the plane of tangency ($\alpha$-$\beta$-$\gamma$) may also be of interest: 

$$\mathbf{n} = C \left[ \sigma^{+}_{kp}\,\mathbf{R}_i \times \mathbf{R}_j + \sigma^{+}_{ip}\,\mathbf{R}_j \times \mathbf{R}_k + \sigma^{+}_{jp}\,\mathbf{R}_k \times \mathbf{R}_i - \left( \sigma^{-}_{kj}\,\mathbf{R}_i + \sigma^{-}_{ik}\,\mathbf{R}_j + \sigma^{-}_{ji}\,\mathbf{R}_k \right) \times \mathbf{R}_p \right], \tag{A.7}$$

where $\sigma^{\pm}_{ab} = \sigma_a \pm \sigma_b$, $C = \sigma_p^2 / (\sigma^{+}_{ip}\,\sigma^{+}_{jp}\,\sigma^{+}_{kp})$, and $\mathbf{n}$ is not of unit magnitude. 



Figure A1 - Sphere Geometry 
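
The construction in Eqns. (A.1)-(A.6) can be checked numerically with a short sketch. The code below is an illustrative reimplementation in pure Python, not the PASS source; all function names are hypothetical. One deliberate deviation: when solving Eqn. (A.4) for U it projects onto R_k - R_i rather than T_ik - R_i. The two vectors are parallel, so the result is the same, but the former cannot vanish for non-collinear centers. The check that no base sphere lies within the other two is omitted.

```python
import math


def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))


def tangent_foot(Ra, Rb, sa, sb, sp):
    """Eqn. (A.6): foot of the perpendicular from the probe onto line a-b."""
    d = sub(Rb, Ra)
    coeff = ((sa + sp) ** 2 - (sb + sp) ** 2) / (2.0 * dot(d, d))
    return add(scale(add(Ra, Rb), 0.5), scale(d, coeff))


def probe_positions(Ri, Rj, Rk, si, sj, sk, sp):
    """Return the two probe centers tangent to all three base spheres, or []."""
    x = unit(sub(Rj, Ri))                  # x' along Rj - Ri
    e = sub(Rk, Ri)
    zc = cross(x, e)                       # direction of z'
    if dot(zc, zc) < 1e-12:
        return []                          # collinear centers: no base plane
    z = unit(zc)
    y = cross(z, x)                        # completes the right-handed frame
    Tij = tangent_foot(Ri, Rj, si, sj, sp)
    Tik = tangent_foot(Ri, Rk, si, sk, sp)
    V = sub(Tik, Tij)
    U = scale(y, dot(V, e) / dot(y, e))    # Eqn. (A.4) solved for U along y'
    Rb = add(Tij, U)                       # Eqn. (A.3)
    h2 = (si + sp) ** 2 - dot(sub(Rb, Ri), sub(Rb, Ri))   # Eqn. (A.2)
    if h2 < 0.0:
        return []                          # probe cannot bridge the spheres
    h = math.sqrt(h2)
    return [add(Rb, scale(z, h)), add(Rb, scale(z, -h))]  # Eqn. (A.1)


if __name__ == "__main__":
    centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (1.0, 2.5, 0.5)]
    radii = [1.5, 1.0, 2.0]
    for p in probe_positions(*centers, *radii, 1.4):
        # Each center-to-probe distance should equal sigma_a + sigma_p.
        print([round(math.dist(p, c) - r - 1.4, 9) for c, r in zip(centers, radii)])
```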
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